#agentic reinforcement learning
30/08/2025
rStar2-Agent: How a 14B Agentic RL Model Beats Bigger Models at Math
Microsoft's rStar2-Agent integrates code execution into the reasoning loop, allowing a 14B model to outperform larger systems on math benchmarks while using shorter reasoning traces.